Occam's Razor Just Got Sharper

Authors

  • Saher Esmeir
  • Shaul Markovitch
Abstract

Occam’s razor is the principle that, given two hypotheses consistent with the observed data, the simpler one should be preferred. Many machine learning algorithms follow this principle and search for a small hypothesis within the version space. The principle has been the subject of a heated debate with theoretical and empirical arguments both for and against it. Earlier empirical studies lacked sufficient coverage to resolve the debate. In this work we provide convincing empirical evidence for Occam’s razor in the context of decision tree induction. By applying a variety of sophisticated sampling techniques, our methodology samples the version space for many real-world domains and tests the correlation between the size of a tree and its accuracy. We show that indeed a smaller tree is likely to be more accurate, and that this correlation is statistically significant across most domains.
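
The core of the methodology can be illustrated in a few lines. The sketch below is a crude approximation, not the paper's actual sampling techniques: it uses scikit-learn's randomized splitter to draw many fully-grown decision trees (each consistent with the training data, hence in the version space) and measures the rank correlation between tree size and held-out accuracy. The dataset and the number of sampled trees are illustrative assumptions.

```python
# Hedged sketch: approximates the paper's idea of sampling the version space
# by growing many randomized, unpruned decision trees and correlating their
# size with test accuracy. The paper's sampling procedures are more refined.
from scipy.stats import spearmanr
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in benchmark dataset (an assumption; the paper uses many domains).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

sizes, accs = [], []
for seed in range(200):
    # splitter="random" draws a different tree per seed; with no size limits,
    # each tree fits the training data exactly and so lies in the version space.
    tree = DecisionTreeClassifier(splitter="random", random_state=seed)
    tree.fit(X_train, y_train)
    sizes.append(tree.get_n_leaves())        # tree size = number of leaves
    accs.append(tree.score(X_test, y_test))  # held-out accuracy

rho, p = spearmanr(sizes, accs)
print(f"Spearman rho(size, accuracy) = {rho:.3f} (p = {p:.3g})")
# Occam's razor predicts a negative correlation: smaller consistent trees
# should tend to be more accurate on unseen data.
```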


Similar articles

Sharpening Occam's Razor

We provide a new representation-independent formulation of Occam’s razor theorem, based on Kolmogorov complexity. This new formulation allows us to: (i) Obtain better sample complexity than both length-based [4] and VC-based [3] versions of Occam’s razor theorem, in many applications; and (ii) Achieve a sharper reverse of Occam’s razor theorem than that of [5]. Specifically, we weaken the assum...
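
For context, the length-based version referenced here is usually stated as a finite-class PAC bound. The following restatement draws on standard results (e.g., Blumer et al.), not on this paper's sharpened theorem:

```latex
% Textbook length-based Occam bound, quoted for context only. If a learner
% outputs a hypothesis h encoded in at most s bits that is consistent with
% m i.i.d. examples, then Pr[err(h) > eps] <= 2^s e^{-eps m}, so
\[
  m \;\ge\; \frac{1}{\varepsilon}\Bigl(s \ln 2 + \ln\tfrac{1}{\delta}\Bigr)
\]
% examples suffice to guarantee error at most eps with probability 1 - delta.
```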

A Quantitative Occam's Razor

Interpreting entropy as a prior probability suggests a universal but “purely empirical” measure of “goodness of fit”. This allows statistical techniques to be used in situations where the correct theory — and not just its parameters — is still unknown. As developed illustratively for least-squares nonlinear regression, the measure proves to be a transformation of the R statistic. Unlike the lat...

Extending Occam's Razor

Occam's Razor states that, all other things being equal, the simpler of two possible hypotheses is to be preferred. A quantified version of Occam's Razor has been proven for the PAC model of learning, giving sample-complexity bounds for learning using what Blumer et al. call an Occam algorithm [1]. We prove an analog of this result for Haussler's more general learning model, which encompasses le...


Publication date: 2007